Welcome to the Open Reading Group's library,

  Here you can find all papers liked or uploaded by the Open Reading Group,
  together with a brief user bio and a description of the group's academic activity.


### Upcoming readings:

No upcoming readings for now...

### Past Readings:

- 21/05/2018 [Graph Embedding Techniques, Applications, and Performance: A Survey](https://papers-gamma.link/paper/52)
- 14/05/2018 [A Fast and Provable Method for Estimating Clique Counts Using Turán’s Theorem](https://papers-gamma.link/paper/24)
- 07/05/2018 [VERSE: Versatile Graph Embeddings from Similarity Measures](https://papers-gamma.link/paper/48)
- 30/04/2018 [Hierarchical Clustering Beyond the Worst-Case](https://papers-gamma.link/paper/45)
- 16/04/2018 [Scalable Motif-aware Graph Clustering](https://papers-gamma.link/paper/18)
- 02/04/2018 [Practical Algorithms for Linear Boolean-width](https://papers-gamma.link/paper/40)
- 26/03/2018 [New Perspectives and Methods in Link Prediction](https://papers-gamma.link/paper/28/New%20Perspectives%20and%20Methods%20in%20Link%20Prediction)
- 19/03/2018 [In-Core Computation of Geometric Centralities with HyperBall: A Hundred Billion Nodes and Beyond](https://papers-gamma.link/paper/37)
- 12/03/2018 [Diversity is All You Need: Learning Skills without a Reward Function](https://papers-gamma.link/paper/36)
- 05/03/2018 [When Hashes Met Wedges: A Distributed Algorithm for Finding High Similarity Vectors](https://papers-gamma.link/paper/23)
- 26/02/2018 [Fast Approximation of Centrality](https://papers-gamma.link/paper/35/Fast%20Approximation%20of%20Centrality)
- 19/02/2018 [Indexing Public-Private Graphs](https://papers-gamma.link/paper/19/Indexing%20Public-Private%20Graphs)
- 12/02/2018 [On the uniform generation of random graphs with prescribed degree sequences](https://papers-gamma.link/paper/26/On%20the%20uniform%20generation%20of%20random%20graphs%20with%20prescribed%20d%20egree%20sequences)
- 05/02/2018 [Linear Additive Markov Processes](https://papers-gamma.link/paper/21/Linear%20Additive%20Markov%20Processes)
- 29/01/2018 [ESCAPE: Efficiently Counting All 5-Vertex Subgraphs](https://papers-gamma.link/paper/17/ESCAPE:%20Efficiently%20Counting%20All%205-Vertex%20Subgraphs)
- 22/01/2018 [The k-peak Decomposition: Mapping the Global Structure of Graphs](https://papers-gamma.link/paper/16/The%20k-peak%20Decomposition:%20Mapping%20the%20Global%20Structure%20of%20Graphs)

Comments:

Feel free to start an awesome discussion.

### Typos:

- "is defined as the the minimum"


Comments:

Nice reference on supervised link prediction, with comparisons to unsupervised techniques: it gives a good overview of possible methods and their limits. The general idea is to put forward the relevance of a supervised approach to predicting links. My main comment is that unsupervised and supervised approaches are so different that they are hardly comparable, and I guess that in most cases people would use supervised learning if their data allowed them to do so...

__Summary__

A few questions discussed in this paper:

_Comparison: unsupervised vs. supervised_

Unsupervised: we use a score (for pairs of nodes) correlated with the existence of a link, e.g. the Adamic-Adar index; pairs are ranked according to this score, then the top-scoring pairs are selected as predictions. Supervised: it turns into a standard binary classification task, with a ground truth to evaluate the quality of the results, so we can use the usual evaluation metrics: precision, recall, ROC curve, etc.

_The problem of class imbalance_

Comparing the number of links that actually exist ($M$, which is more or less of the order of magnitude of $N$ in real networks) to the number of possible links (of order $N^2$), the ratio is in $\frac{1}{N}$. The class of existing links is thus much smaller than the class of possible links, making the classification task imbalanced. This problem tends to worsen with the size of the network.

_Variance reduction_

With a supervised approach, it is possible to use ensemble frameworks to reduce variance and avoid overfitting (e.g. bagging, random forests). The broad idea is to turn the learning into an ensemble of learners and average their results (typically via voting between classes).

_Classes of pairs_

To improve prediction quality, they divide pairs into classes depending on the distance in the graph between the nodes, then learn on each class independently. This drastically reduces the ratio (existing links) / (possible links) in each of these classes.
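The unsupervised pipeline described above (score each non-adjacent pair, rank, predict the top pairs) can be sketched with the Adamic-Adar index. This is a minimal illustration, not the paper's code; the toy graph and node names are hypothetical:

```python
import math
from itertools import combinations

# Toy undirected graph as an adjacency dict (hypothetical example data).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"a", "c"},
    "e": {"c"},
}

def adamic_adar(u, v):
    """Adamic-Adar index: sum of 1/log(deg(z)) over common neighbours z."""
    return sum(1.0 / math.log(len(graph[z]))
               for z in graph[u] & graph[v]
               if len(graph[z]) > 1)  # skip degree-1 neighbours (log 1 = 0)

# Score every non-adjacent pair, then rank: top pairs are the predictions.
candidates = [(u, v) for u, v in combinations(sorted(graph), 2)
              if v not in graph[u]]
ranking = sorted(candidates, key=lambda p: adamic_adar(*p), reverse=True)
print(ranking[0])  # the highest-scoring missing link
```

Here the pair ("b", "d") ranks first, since b and d share two well-connected neighbours (a and c). Note that even in this toy graph the candidate set grows like $N^2$ while real links grow roughly like $N$, which is exactly the class-imbalance issue raised above.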